NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

SpotVerse: Optimizing Bioinformatics Workflows with Multi-Region Spot Instances in Galaxy and Beyond

https://doi.org/10.1145/3652892.3700750

Son, Myungjun; Akbulut, Gulsum Gudukbay; Kandemir, Mahmut Taylan (December 2024, ACM)

Full Text Available
Load and MLP-Aware Thread Orchestration for Recommendation Systems Inference on CPUs

https://doi.org/10.1145/3676641.3716003

Jain, Rishabh; Chou, Teyuh; Kayiran, Onur; Kalamatianos, John; Loh, Gabriel H; Kandemir, Mahmut T; Das, Chita R (March 2025, ACM)

Free, publicly-accessible full text available March 30, 2026
FAAStloop: Optimizing Loop-Based Applications for Serverless Computing

https://doi.org/10.1145/3698038.3698560

Mohanty, Shruti; Bhasi, Vivek M; Son, Myungjun; Kandemir, Mahmut Taylan; Das, Chita (November 2024, ACM)

Full Text Available
Pirate: No Compromise Low-Bandwidth VR Streaming for Edge Devices

https://doi.org/10.1145/3676641.3716268

Zhang, Yingtian; Kang, Yan; Ying, Ziyu; Lu, Wanhang; Lan, Sijie; Xu, Huijuan; Maeng, Kiwan; Sivasubramaniam, Anand; Kandemir, Mahmut T; Das, Chita R (March 2025, ACM)

Free, publicly-accessible full text available March 30, 2026
GameStreamSR: Enabling Neural-Augmented Game Streaming on Commodity Mobile Platforms

https://doi.org/10.1109/ISCA59077.2024.00097

Bhuyan, Sandeepa; Ying, Ziyu; Kandemir, Mahmut T; Gowda, Mahanth; Das, Chita R (June 2024, IEEE)

Full Text Available
Paldia: Enabling SLO-Compliant and Cost-Effective Serverless Computing on Heterogeneous Hardware

https://doi.org/10.1109/IPDPS57955.2024.00018

Bhasi, Vivek M; Sharma, Aakash; Mohanty, Shruti; Kandemir, Mahmut Taylan; Das, Chita R (May 2024, IEEE)

Among the variety of applications (apps) being deployed on serverless platforms, apps such as Machine Learning (ML) inference serving can achieve better performance from leveraging accelerators like GPUs. Yet, major serverless providers, despite having GPU-equipped servers, do not offer GPU support for their serverless functions. Given that serverless functions are deployed on various generations of CPUs already, extending this to various (typically more expensive) GPU generations can offer providers a greater range of hardware to serve incoming requests according to the functions and request traffic. Here, providers are faced with the challenge of selecting hardware to reach a well-proportioned trade-off point between cost and performance. While recent works have attempted to address this, they often fail to do so as they overlook optimization opportunities arising from intelligently leveraging existing GPU sharing mechanisms. To address this point, we devise a heterogeneous serverless framework, PALDIA, which uses a prudent Hardware selection policy to acquire capable, costeffective hardware and perform intelligent request scheduling on it to yield high performance and cost savings. Specifically, our scheduling algorithm employs hybrid spatio-temporal GPU sharing that intelligently trades off job queueing delays and interference to allow the chosen cost-effective hardware to also be highly performant. We extensively evaluate PALDIA using 16 ML inference workloads with real-world traces on a 6 node heterogeneous cluster. Our results show that PALDIA significantly outperforms state-of-the-art works in terms of Service Level Objective (SLO) compliance (up to 13.3% more) and tail latency (up to ∼50% less), with cost savings up to 86%.
more » « less
Full Text Available
Usas: A Sustainable Continuous-Learning Framework for Edge Servers

https://doi.org/10.1109/HPCA57654.2024.00073

Mishra, Cyan Subhra; Sampson, Jack; Kandemir, Mahmut Taylan; Narayanan, Vijaykrishnan; Das, Chita R (March 2024, IEEE)

Edge servers have recently become very popular for performing localized analytics, especially on video, as they reduce data traffic and protect privacy. However, due to their resource constraints, these servers often employ compressed models, which are typically prone to data drift. Consequently, for edge servers to provide cloud-comparable quality, they must also perform continuous learning to mitigate this drift. However, at expected deployment scales, performing continuous training on every edge server is not sustainable due to their aggregate power demands on grid supply and associated sustainability footprints. To address these challenges, we propose Us.as,´ an approach combining algorithmic adjustments, hardware-software co-design, and morphable acceleration hardware to enable the training of workloads on these edge servers to be powered by renewable, but intermittent, solar power that can sustainably scale alongside data sources. Our evaluation of Us.as on a real-world´ traffic dataset indicates that our continuous learning approach simultaneously improves both accuracy and efficiency: Us.as´ offers a 4.96% greater mean accuracy than prior approaches while our morphable accelerator that adapts to solar variance can save up to {234.95kWH, 2.63MWH}/year/edge-server compared to a {DNN accelerator, data center scale GPU}, respectively.
more » « less
Full Text Available
Studying CPU and memory utilization of applications on Fujitsu A64FX and Nvidia Grace Superchip

https://doi.org/10.1145/3695794.3695813

Kang, Yan; Ghosh, Sayan; Kandemir, Mahmut; Márquez, Andrés (September 2024, ACM)

Full Text Available
License Forecasting and Scheduling for HPC

https://doi.org/10.1109/MASCOTS59514.2023.10387539

Gulhan, Ahmed Burak; Akbulut, Gulsum Gudukbay; Amritkar, Amit; Sampson, Jack; Honovar, Vasant; Focht, Adam; Pavloski, Chuck; Kandemir, Mahmut (October 2023, IEEE)

This work focuses on forecasting future license usage for high-performance computing environments and using such predictions to improve the effectiveness of job scheduling. Specifically, we propose a model that carries out both short-term and long-term license usage forecasting and a method of using forecasts to improve job scheduling. Our long-term forecasting model achieves a Mean Absolute Percentage Error (MAPE) as low as 0.26 for a 12-month forecast of daily peak license usage. Our job scheduling experimental results also indicate that wasted work from jobs with insufficient licenses can be reduced by up to 92% without increasing the average license-using job completion times, during periods of high license usage, with our proposed license-aware scheduler.
more » « less
Full Text Available
TRIM: crossTalk-awaRe qubIt Mapping for multiprogrammed quantum systems

https://doi.org/10.1109/QSW59989.2023.00025

Khadirsharbiyani, Soheil; Sadeghi, Movahhed; Zarch, Mostafa Eghbali; Kotra, Jagadish; Kandemir, Mahmut Taylan (July 2023, IEEE International Conference on Quantum Software (QSW))

Full Text Available

« Prev Next »

Search for: All records